Static Versus Dynamic Sampling for Data Mining
Authors
Abstract
As data warehouses grow to the point where one hundred gigabytes is considered small, the computational efficiency of data-mining algorithms on large databases becomes increasingly important. Using a sample from the database can speed up the data-mining process, but this is only acceptable if it does not reduce the quality of the mined knowledge. To this end, we introduce the “Probably Close Enough” criterion to describe the desired properties of a sample. Sampling usually refers to the use of static statistical tests to decide whether a sample is sufficiently similar to the large database, in the absence of any knowledge of the tools the data miner intends to use. We discuss dynamic sampling methods, which take into account the mining tool being used and can thus give better samples. We describe dynamic schemes that observe a mining tool’s performance on training samples of increasing size and use these results to determine when a sample is sufficiently large. We evaluate these sampling methods on data from the UCI repository and conclude that dynamic sampling is preferable.
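The idea of growing a training sample until the mining tool stops improving can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy nearest-centroid classifier, the synthetic one-dimensional data, the geometric growth schedule, and the stopping rule (stop when held-out accuracy improves by no more than `epsilon`). That rule is a simple stand-in and is not the paper's “Probably Close Enough” test, whose exact form the abstract does not give.

```python
import random

def make_data(n, rng):
    """Hypothetical synthetic data: label y in {0, 1}, feature x ~ N(2*y, 1)."""
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append((rng.gauss(2.0 * y, 1.0), y))
    return rows

def train_centroid(rows):
    """Fit a toy nearest-centroid classifier (stand-in for the mining tool)."""
    sums = {0: 0.0, 1: 0.0}
    counts = {0: 0, 1: 0}
    for x, y in rows:
        sums[y] += x
        counts[y] += 1
    # Fall back to 0.0 if a class is missing from a very small sample.
    return {c: (sums[c] / counts[c] if counts[c] else 0.0) for c in (0, 1)}

def accuracy(model, rows):
    """Fraction of rows whose nearest centroid matches the true label."""
    correct = sum(
        1 for x, y in rows
        if min(model, key=lambda c: abs(x - model[c])) == y
    )
    return correct / len(rows)

def dynamic_sample(data, holdout, start=50, growth=2, epsilon=0.01):
    """Grow the sample geometrically; stop once the held-out accuracy
    gain from one size to the next is at most epsilon (assumed rule)."""
    n = start
    prev_acc = accuracy(train_centroid(data[:n]), holdout)
    while n * growth <= len(data):
        n *= growth
        # data is already in random order, so a prefix is a random sample.
        acc = accuracy(train_centroid(data[:n]), holdout)
        if acc - prev_acc <= epsilon:
            return n, acc
        prev_acc = acc
    return n, prev_acc  # schedule exhausted without meeting the rule

rng = random.Random(0)
data = make_data(20000, rng)
holdout = make_data(2000, rng)
n, acc = dynamic_sample(data, holdout)
print(f"stopped at sample size {n} with held-out accuracy {acc:.3f}")
```

A static scheme, by contrast, would fix the sample size from statistical tests on the data alone, without ever running the classifier. The sketch shows only that the stopping point here depends on the tool being used.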
Similar resources
Monologism of Hofstede’s Static Model vs Dialogism of Fang’s Dynamic Model: Contradictory Value Configuration of Cultures through the Case Study of Farsi Proverbs
Among various cultural models, the dichotomy of static versus dynamic models has provided a fertile ground for research. Although a number of static models are suggested, the dominant trend in almost all static models is provided by Hofstede who focuses on cultural differences along four major dimensions (power distance, individualism, uncertainty avoidance, and masculinity) and reduces “the co...
Effect of Micro-Structure on Fatigue Behavior of Intact Rocks under Completely Reversed Loading
Rock formations and structures can be subjected to both static and dynamic loadings. Static loadings resulting from different sources such as gravity and tectonic forces and dynamic forces are intermittently transmitted via vibrations of the earth’s crust, through major earthquakes, rock bursts, rock blasting and drilling and also, traffic. Reaction of rocks to cyclic and repetitive stresses re...
Comparison of pseudo-static, Newmark and dynamic response analysis of the final pit wall of Sungun copper mine
Sungun Copper Mine is located in an area with a high level of seismic hazard. Most recently, the Ahar-Varzeqan earthquake with a magnitude of 6.2 on Richter scale occurred on August 11, 2012; at a distance about 40 kilometers away from the mine. Nevertheless, the seismic stability of the final pit wall has not been comprehensively reviewed. In this research, the southwestern wall of the final p...
Effect of Active Dynamic Versus Passive Static Stretching on Hamstring Muscle Tightness in Healthy Female Students: A Randomized Trial Study
Background: For decades, static stretching has been the standard benchmark for training programs, because it has been shown to increase flexibility compared with other methods of stretching. Objective: The current study investigated and compared the effects of active dynamic stretching and passive static stretching on hamstring tightness. Me...
30th International Conference on Ground Control in Mining
Blasting operations generate seismic effects in underground mines. These effects apply additional dynamic loads on the support system, which should bear both static and dynamic loads. Static loads are caused by the weight of the superincumbent strata, while dynamic loads occur as a result of blasting in the mining area. Identification of the origin and determination of the support system behavi...
Publication date: 1996